A Note on Interfacing Object Warehouses and Mass Storage Systems for Data Mining Applications
Authors
Abstract
Data mining is the automatic discovery of patterns, associations, and anomalies in data sets, and it requires numerically and statistically intensive queries. Our assumption is that data mining requires a specialized data management infrastructure to support these queries, but because of the size of the data involved, this infrastructure is layered over a hierarchical storage system. In this paper, we discuss the architecture of a system that is layered for modularity but exploits specialized lightweight services to maintain efficiency. For example, rather than use a full-function database, we use lightweight object services specialized for data mining. We propose placing information repositories between layers so that components on either side of a layer boundary can consult them when making decisions about data layout, the caching and migration of data, the scheduling of queries, and related matters.
Introduction
Data mining is the automatic discovery of patterns, associations, and anomalies in data sets. Mining large data sets is a special challenge because the process requires numerically and statistically intensive queries on large amounts of data. Our assumption is that data mining requires a specialized data management infrastructure, but because of the size of the data involved, this infrastructure is layered over a hierarchical storage system. Our concern in this paper is an appropriate open, layered architecture to support this.
A common layered architecture for this type of system is illustrated in Figure 1. There are three layers: the storage management layer, the data management layer, and the data mining and analysis layer. Unless these three layers coordinate how the data is physically laid out, how it is cached and migrated, and how it is prefetched, they can work at cross purposes and drastically impair the performance of the overall system.
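The idea of an information repository shared between two layers can be illustrated with a minimal Python sketch. It is not the system described in the paper: the class names (`InformationRepository`, `StorageLayer`, `DataManagementLayer`), the two-slot disk cache, and the least-accessed eviction rule are all assumptions chosen for illustration. The data management layer announces which objects a query will read, and the storage layer uses that announcement to stage objects and uses the recorded access counts to pick an eviction victim.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class InformationRepository:
    """Hypothetical metadata store shared by the layers on both sides."""
    access_counts: Counter = field(default_factory=Counter)
    planned_reads: list = field(default_factory=list)

    def record_access(self, object_id):
        self.access_counts[object_id] += 1

    def announce_plan(self, object_ids):
        self.planned_reads.extend(object_ids)


class StorageLayer:
    """Toy hierarchical store: a small disk cache in front of tape."""

    def __init__(self, repo, cache_slots):
        self.repo = repo
        self.cache_slots = cache_slots
        self.cache = []  # object ids currently staged on disk

    def prefetch(self):
        # Stage announced objects until the disk cache is full,
        # then discard the plan so it is not reused by a later query.
        staged = []
        for oid in self.repo.planned_reads:
            if len(self.cache) >= self.cache_slots:
                break
            if oid not in self.cache:
                self.cache.append(oid)
                staged.append(oid)
        self.repo.planned_reads.clear()
        return staged

    def read(self, oid):
        self.repo.record_access(oid)
        if oid in self.cache:
            return "disk"
        if len(self.cache) >= self.cache_slots:
            # Evict the cached object with the fewest recorded accesses.
            victim = min(self.cache, key=lambda o: self.repo.access_counts[o])
            self.cache.remove(victim)
        self.cache.append(oid)
        return "tape"


class DataManagementLayer:
    """Toy object service that publishes its read plan before querying."""

    def __init__(self, repo, storage):
        self.repo = repo
        self.storage = storage

    def run_query(self, object_ids):
        self.repo.announce_plan(object_ids)
        self.storage.prefetch()
        return [self.storage.read(oid) for oid in object_ids]
```

In this sketch neither layer calls into the other's internals to make its decisions; both read and write the repository, which is the kind of coordination the proposal is aiming for.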
* This work was supported in part by the Massive Digital Data Systems Program.
Similar resources
Data Warehousing Applications: an Analytical Tool for Decision Support System
Data-driven decision support systems, such as data warehouses, can meet the requirement of extracting information from more than one subject area. Data warehouses standardize data across the organization so as to provide a single view of information. Data warehouses (DW) can provide the information required by decision makers. The data warehouse supports an on-line analytical processing...
Sizing of a Packed Bed Storage for Solar Air Heating Systems (TECHNICAL NOTE)
Packed bed units generally represent the most suitable storage units for air heating solar systems. In these systems the storage unit receives heat from the collector during the collection period and discharges it to the building during the retrieval process. A method for sizing packed-bed storage in an air heating system is presented. The design is based on the K-S curves, which hav...
Transient Two-Dimensional (r-z) Cyclic Charging/Discharging Analysis of Space Thermal Energy Storage Systems (RESEARCH NOTE)
A two-dimensional transient axi-symmetric model was developed to study the effects of various thermal and geometric parameters on cyclic heating and cooling modes of a phase-change thermal energy storage system. The high-temperature thermal energy storage device utilizes LiH for heat sink applications to store the waste heat generated during power-burst periods. The stored heat is then discharg...
Processes and apparatuses for formation, separation, pelletizing storage and re-gasification of gas hydrate
In recent years, the feasibility of using gas hydrates in industrial systems has drawn much attention as a subject of engineering studies. Despite the suggested applications for gas hydrate in transportation and storage of natural gas, desalination of water, etc., there have been few applied industrial experiences with gas hydrates. There are several patents and papers on promotion of gas hydra...
Using a Data Mining Tool and FP-Growth Algorithm Application for Extraction of the Rules in two Different Dataset (TECHNICAL NOTE)
In this paper, we want to improve association rules in order to be used in recommenders. Recommender systems present a method to create the personalized offers. One of the most important types of recommender systems is the collaborative filtering that deals with data mining in user information and offering them the appropriate item. Among the data mining methods, finding frequent item sets and ...
Publication date: 1996